Correction of likelihoods for degrees of freedom in robust speech recognition using missing feature theory
Abstract
In Missing Feature Theory (MFT), speech recognizers are made robust to noise by modifying the likelihood computed by the acoustic model to reflect that some features extracted from the signal are unreliable or missing. In one implementation of MFT, the acoustic model and bounds on the unreliable features are used to infer an estimate of the missing data. This paper addresses an observed bias of the likelihood evaluated at that estimate. Theoretical and experimental evidence is provided that an upper bound on the accuracy is improved by applying a computationally simple correction for the number of free variables in the likelihood maximization.
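The bounded estimate described in the abstract can be illustrated with a minimal sketch, assuming a single diagonal-covariance Gaussian per state and an upper bound given by the observed noisy value. The abstract does not state the form of the correction, so the `penalty` parameter below is a hypothetical placeholder (a fixed cost per free variable), not the paper's formula; the function names are likewise illustrative.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def bounded_imputation_loglik(obs, reliable, means, variances,
                              penalty=0.0, lower=-math.inf):
    """Score one frame against one diagonal-Gaussian state.

    Reliable features are scored at their observed value. Each unreliable
    feature is a free variable: it is replaced by the value in [lower, obs]
    that maximizes the Gaussian, i.e. the state mean clipped to the bounds.
    `penalty` is an ASSUMED per-free-variable correction, not the paper's.
    Returns (corrected log-likelihood, number of free variables).
    """
    total = 0.0
    n_free = 0
    for y, ok, mu, var in zip(obs, reliable, means, variances):
        if ok:
            total += log_gauss(y, mu, var)
        else:
            x_hat = min(max(mu, lower), y)  # maximizer within the bounds
            total += log_gauss(x_hat, mu, var)
            n_free += 1
    return total - penalty * n_free, n_free


# Second feature is unreliable; its estimate is the state mean (3.0 <= 5.0).
score, free = bounded_imputation_loglik(
    obs=[2.0, 5.0], reliable=[True, False],
    means=[2.0, 3.0], variances=[1.0, 1.0], penalty=0.5)
```

Because every unreliable feature is scored at its maximizing value, the uncorrected likelihood tends to grow more optimistic as more features are treated as free, which is the kind of bias a correction that depends on the number of free variables would counteract.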
Similar resources
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the family of missing feature methods. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Two correction models for likelihoods in robust speech recognition using missing feature theory
In Missing Feature Theory (MFT), it is assumed that some of the features that are extracted from an observation are missing or unreliable. Applied to spectral features for noisy speech recognition, the clean feature values are known to be less than the observed noisy features. Based on this inequality constraint, an HMM-state-dependent clean speech value of the missing features can be inferred ...
Visual Tracking using Kernel Projected Measurement and Log-Polar Transformation
Visual servoing generally consists of control and feature tracking. A study of previous methods shows that no attempt has been made to optimize these two parts together. In the kernel-based visual servoing method, the main objective is to combine and optimize these two parts together and to form a complete control loop. This is accomplished by using Lyapunov theory. A Lyapunov candidat...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Publication date: 2003